logical function
Learning to Reason with Neural Networks: Generalization, Unseen Data and Boolean Measures
This paper considers the Pointer Value Retrieval (PVR) benchmark introduced in [ZRKB21], where a 'reasoning' function acts on a string of digits to produce the label. More generally, the paper considers the learning of logical functions with gradient descent (GD) on neural networks. It is first shown that, when learning logical functions with gradient descent on symmetric neural networks, the generalization error can be lower-bounded in terms of the noise stability of the target function, supporting a conjecture made in [ZRKB21]. It is then shown that in the distribution-shift setting, when the data withholding corresponds to freezing a single feature (referred to as canonical holdout), the generalization error of gradient descent admits a tight characterization in terms of the Boolean influence for several relevant architectures. This is shown for linear models and supported experimentally on other models such as MLPs and Transformers. In particular, this puts forward the hypothesis that for such architectures, and for learning logical functions such as PVR functions, GD tends to have an implicit bias towards low-degree representations, which in turn yields the Boolean influence as the generalization error under quadratic loss.
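To make the Boolean measures in the abstract above concrete, here is a minimal sketch that brute-forces the Boolean influence of each input bit for a toy PVR-style function. The specific function (a 2-bit pointer selecting a 2-bit window whose parity is the label) is an illustrative stand-in, not the exact benchmark instance from [ZRKB21].

```python
from itertools import product

def pvr(bits):
    """Toy PVR-style function on 6 bits: the first 2 bits form a pointer
    p in {0,1,2,3}; the label is the parity of a length-2 window of the
    remaining 4 bits starting at position p (wrapping around)."""
    p = 2 * bits[0] + bits[1]
    payload = bits[2:]
    window = [payload[(p + k) % len(payload)] for k in range(2)]
    return sum(window) % 2

def influence(f, n, i):
    """Boolean influence of coordinate i: the probability over a uniform
    input x that flipping bit i changes f(x)."""
    count = 0
    for x in product([0, 1], repeat=n):
        y = list(x)
        y[i] ^= 1
        count += f(list(x)) != f(y)
    return count / 2 ** n

n = 6
for i in range(n):
    print(f"Inf_{i} = {influence(pvr, n, i):.3f}")
```

Under the canonical-holdout characterization described above, freezing a coordinate with larger influence would be expected to incur a larger generalization error.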
Thinker: Training LLMs in Hierarchical Thinking for Deep Search via Multi-Turn Interaction
Xu, Jun, Du, Xinkai, Ao, Yu, Zhao, Peilong, Li, Yang, Zhong, Ling, Yuan, Lin, Bo, Zhongpu, Wang, Xiaorui, Sun, Mengshu, Gui, Zhengke, Zhang, Dalong, Wang, Zhaoyang, Wang, Qiwei, Hou, Yangyang, Yin, Zhiying, Wang, Haofen, Chen, Huajun, Liang, Lei, Zhou, Jun
Efficient retrieval from external knowledge bases and web pages is crucial for enhancing the reasoning abilities of LLMs. Previous works on training LLMs to leverage external retrievers for solving complex problems have predominantly employed end-to-end reinforcement learning. However, these approaches neglect supervision over the reasoning process, making it difficult to guarantee logical coherence and rigor. To address these limitations, we propose Thinker, a hierarchical thinking model for deep search through multi-turn interaction, which makes the reasoning process supervisable and verifiable. It decomposes complex problems into independently solvable sub-problems, each dually represented as both a natural-language question and an equivalent logical function, to support knowledge-base and web searches. Concurrently, dependencies between sub-problems are passed as parameters via these logical functions, enhancing the logical coherence of the problem-solving process. To avoid unnecessary external searches, we perform knowledge boundary determination to check whether a sub-problem falls within the LLM's intrinsic knowledge, in which case the model answers it directly. Experimental results indicate that with only a few hundred training samples, the performance of Thinker is competitive with established baselines. Furthermore, when scaled to the full training set, Thinker significantly outperforms these methods across various datasets and model sizes. The source code is available at https://github.com/OpenSPG/KAG-Thinker.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Europe > Austria > Vienna (0.14)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- (15 more...)
- Research Report (0.81)
- Workflow (0.68)
- Leisure & Entertainment (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- Media > Film (0.68)
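To illustrate the hierarchical decomposition described in the Thinker abstract above, here is a hypothetical sketch: sub-problems are dually represented as natural language and a logical function, dependencies are passed as parameters, and a knowledge-boundary check decides whether an external retriever is needed. All names, the `Retrieval(...)` syntax, and the confidence threshold are illustrative assumptions, not the KAG-Thinker API.

```python
from dataclasses import dataclass, field

@dataclass
class SubProblem:
    # Dual representation: a natural-language question plus an equivalent
    # logical function whose arguments may reference earlier answers.
    question: str
    logic: str                      # e.g. "Retrieval(s=..., p=..., o=?x0)" (hypothetical syntax)
    depends_on: list = field(default_factory=list)

def within_knowledge_boundary(sub: SubProblem, llm_confidence: float) -> bool:
    # Hypothetical knowledge-boundary check: answer directly only if the
    # model is confident the sub-problem lies in its intrinsic knowledge.
    return llm_confidence >= 0.8

plan = [
    SubProblem("Which company developed AlphaGo?",
               "Retrieval(s=AlphaGo, p=developed_by, o=?x0)"),
    SubProblem("Who is the CEO of that company?",
               "Retrieval(s=?x0, p=ceo, o=?x1)", depends_on=["?x0"]),
]

for sub in plan:
    source = "LLM" if within_knowledge_boundary(sub, llm_confidence=0.9) else "retriever"
    print(f"{sub.logic:45s} -> answered by {source}")
```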
Logifold: A Geometrical Foundation of Ensemble Machine Learning
We present a local-to-global and measure-theoretical approach to understanding datasets. The core idea is to formulate a logifold structure and to interpret network models with restricted domains as local charts of datasets. In particular, this provides a mathematical foundation for ensemble machine learning. Our experiments demonstrate that logifolds can be implemented to identify fuzzy domains and to improve accuracy compared to taking the average of model outputs. Additionally, we provide a theoretical example of a logifold, highlighting the importance of restricting to the domains of classifiers in an ensemble.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > Oregon > Multnomah County > Portland (0.04)
- (2 more...)
A logifold structure on measure space
In this paper, we develop a local-to-global and measure-theoretical approach to understanding datasets. The idea is to take network models with restricted domains as local charts of datasets. We develop the mathematical foundations for these structures, and show in experiments how they can be used to find fuzzy domains and to improve accuracy in data classification problems.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > South Carolina > Richland County > Columbia (0.04)
- North America > United States > Pennsylvania (0.04)
- (5 more...)
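A minimal sketch of the local-chart idea from the two logifold abstracts above: each chart pairs a classifier with a restricted domain, and a point is labeled by voting only among the charts whose domain contains it. The toy charts and domains below are illustrative assumptions, not a construction from the papers.

```python
import numpy as np

class Chart:
    """A local chart: a classifier together with the subset of the
    dataset (its domain) on which it is considered reliable."""
    def __init__(self, domain, predict):
        self.domain = domain      # callable: x -> bool
        self.predict = predict    # callable: x -> class label

def logifold_predict(charts, x):
    # Restrict to charts whose domain contains x, then majority-vote.
    votes = [c.predict(x) for c in charts if c.domain(x)]
    if not votes:
        return None               # x lies in no chart's domain ("fuzzy" region)
    values, counts = np.unique(votes, return_counts=True)
    return values[np.argmax(counts)]

# Two toy charts on the real line, each reliable on a half-line.
charts = [
    Chart(domain=lambda x: x < 1.0,  predict=lambda x: 0),
    Chart(domain=lambda x: x > -1.0, predict=lambda x: int(x > 0)),
]
for x in (-2.0, 0.5, 2.0):
    print(x, "->", logifold_predict(charts, x))
```

Restricting each classifier to its domain, rather than averaging all model outputs everywhere, is the point both abstracts emphasize.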
Universal Mechanical Polycomputation in Granular Matter
Parsa, Atoosa, Witthaus, Sven, Pashine, Nidhi, O'Hern, Corey S., Kramer-Bottiglio, Rebecca, Bongard, Josh
Unconventional computing devices are increasingly of interest as they can operate in environments hostile to silicon-based electronics, or compute in ways that traditional electronics cannot. Mechanical computers, wherein information processing is a material property emerging from the interaction of components with the environment, are one such class of devices. This information processing can be manifested in various physical substrates, one of which is granular matter. In a granular assembly, vibration can be treated as the information-bearing mode. This can be exploited to realize "polycomputing": materials can be evolved such that a single grain within them can report the result of multiple logical operations simultaneously at different frequencies, without recourse to quantum effects. Here, we demonstrate the evolution of a material in which one grain acts simultaneously as two different NAND gates at two different frequencies. NAND gates are of interest because any logical operation can be built from them. Moreover, they are nonlinear, thus demonstrating a step toward general-purpose, computationally dense mechanical computers. Polycomputation was found to be distributed across each evolved material, suggesting the material's robustness. With recent advances in material sciences, hardware realization of these materials may eventually provide devices that challenge the computational density of traditional computers.
- Europe > Portugal > Lisbon > Lisbon (0.06)
- North America > United States > Connecticut > New Haven County > New Haven (0.05)
- North America > United States > Vermont > Chittenden County > Burlington (0.04)
- (2 more...)
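A hedged sketch of the readout scheme implied by the abstract above: input bits are vibrations injected into the material, and the output grain's amplitude at a probe frequency is thresholded to read a logic value, so one grain can report NAND at two frequencies at once. The response function and frequencies below are synthetic stand-ins constructed to satisfy NAND by definition; they are not measurements from the paper.

```python
def nand(a, b):
    return int(not (a and b))

def reads_as_nand(amplitude, freq, threshold):
    """Check whether thresholding a grain's output amplitude at one
    frequency reproduces the NAND truth table over all input pairs."""
    return all(
        int(amplitude(a, b, freq) > threshold) == nand(a, b)
        for a in (0, 1) for b in (0, 1)
    )

# Synthetic stand-in for an evolved material's response (not measured data):
# the output amplitude drops only when both inputs vibrate, which is
# exactly the NAND pattern.
def toy_amplitude(a, b, freq):
    return 1.0 - 0.9 * (a * b)

for f in (50.0, 80.0):   # two hypothetical probe frequencies (Hz)
    print(f"NAND at {f} Hz:", reads_as_nand(toy_amplitude, f, threshold=0.5))
```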
Logical Entity Representation in Knowledge-Graphs for Differentiable Rule Learning
Han, Chi, He, Qizheng, Yu, Charles, Du, Xinya, Tong, Hanghang, Ji, Heng
Probabilistic logical rule learning has shown great strength in logical rule mining and knowledge graph completion. It learns logical rules to predict missing edges by reasoning on existing edges in the knowledge graph. However, previous efforts have largely been limited to only modeling chain-like Horn clauses such as $R_1(x,z)\land R_2(z,y)\Rightarrow H(x,y)$. This formulation overlooks additional contextual information from neighboring sub-graphs of entity variables $x$, $y$ and $z$. Intuitively, there is a large gap here, as local sub-graphs have been found to provide important information for knowledge graph completion. Inspired by these observations, we propose Logical Entity RePresentation (LERP) to encode contextual information of entities in the knowledge graph. A LERP is designed as a vector of probabilistic logical functions on the entity's neighboring sub-graph. It is an interpretable representation while allowing for differentiable optimization. We can then incorporate LERP into probabilistic logical rule learning to learn more expressive rules. Empirical results demonstrate that with LERP, our model outperforms other rule learning methods in knowledge graph completion and is comparable or even superior to state-of-the-art black-box methods. Moreover, we find that our model can discover a more expressive family of logical rules. LERP can also be further combined with embedding learning methods like TransE to make it more interpretable.
- North America > United States > Texas (0.04)
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Association Learning (1.00)
- (2 more...)
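For reference, a minimal sketch of scoring a chain-like Horn rule of the form $R_1(x,z)\land R_2(z,y)\Rightarrow H(x,y)$ on a toy knowledge graph, with relations as Boolean adjacency matrices so the rule body becomes a matrix product. This is the standard setup the abstract builds on; LERP would additionally attach to the entity variables a vector of probabilistic logical features computed from their neighboring sub-graphs, which is not shown here. The toy graph is fabricated purely for illustration.

```python
import numpy as np

# Toy knowledge graph over 4 entities; each relation is an adjacency matrix.
n = 4
R1 = np.zeros((n, n)); R1[0, 1] = R1[2, 3] = 1   # R1(x, z) edges
R2 = np.zeros((n, n)); R2[1, 2] = R2[3, 0] = 1   # R2(z, y) edges
H  = np.zeros((n, n)); H[0, 2] = 1               # observed head edges H(x, y)

# Body of the chain rule R1(x,z) & R2(z,y): pairs (x, y) connected by a path.
body = (R1 @ R2) > 0

# Rule confidence: fraction of body groundings that also satisfy the head.
support = np.logical_and(body, H > 0).sum()
confidence = support / body.sum()
print("body groundings:", int(body.sum()),
      "support:", int(support),
      "confidence:", confidence)
```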
An Evolutionary-Based Approach to Learning Multiple Decision Models from Underrepresented Data
Schetinin, Vitaly, Li, Dayou, Maple, Carsten
The use of multiple Decision Models (DMs) enhances the accuracy of decisions and at the same time allows users to evaluate confidence in the decision making. In this paper we explore the ability of multiple DMs to learn from a small amount of verified data. This becomes important when data samples are difficult to collect and verify. We propose an evolutionary-based approach to solving this problem. The proposed technique is examined on a few clinical problems represented by a small amount of data.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)
- North America > United States > Wisconsin (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (6 more...)
Explorations with the Dynamic Wave Model
Rebotier, Thomas P., Elman, Jeffrey L.
Following Shrager and Johnson (1995), we study the growth of logical function complexity in a network swept by two overlapping waves: one of pruning, and the other of Hebbian reinforcement of connections. Results indicate a significant spatial gradient in the appearance of both linearly separable and non-linearly separable functions of the network's two inputs; the non-linearly separable cells are much sparser, and their slope of appearance is sensitive to parameters in a highly nonlinear way.
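As background for the linearly separable vs. non-linearly separable distinction in the abstract above, the sketch below enumerates all 16 Boolean functions of two inputs and checks which can be realized by a single threshold unit; the two that cannot are XOR and XNOR, which is one reason non-linearly separable cells would be expected to appear more rarely. The brute-force weight grid is an arbitrary illustrative choice.

```python
from itertools import product
import numpy as np

inputs = list(product([0, 1], repeat=2))

def linearly_separable(truth_table):
    """Brute-force search for weights w1, w2 and bias b such that
    (w1*x1 + w2*x2 + b > 0) reproduces the truth table."""
    grid = np.linspace(-2, 2, 17)
    for w1 in grid:
        for w2 in grid:
            for b in grid:
                if all((w1 * x1 + w2 * x2 + b > 0) == bool(t)
                       for (x1, x2), t in zip(inputs, truth_table)):
                    return True
    return False

# Enumerate all 16 Boolean functions of two inputs by their truth tables.
for code in range(16):
    table = [(code >> k) & 1 for k in range(4)]
    if not linearly_separable(table):
        print("not linearly separable:", dict(zip(inputs, table)))
```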